Genome rearrangement algorithms
Abstract
With the growing number of sequenced genomes, comparing species on the basis of these data becomes increasingly interesting. In contrast to the classical approach, which considers only point mutations, genome rearrangement problems ignore small mutations and consider only large-scale mutations that change the gene order on the chromosomes. This makes these problems a powerful tool for studying organisms that diverged millions of years ago, or very fast-evolving genomes such as those of cancerous cells. A large variety of problems arises from genome rearrangements, such as calculating evolutionary distances, reconstructing evolutionary events and the gene order of hypothetical ancestors, or even reconstructing whole phylogenetic trees. Because most of these problems are highly complex, the algorithms are based on highly simplified models of reality. For example, most algorithms consider only a single type or a small set of evolutionary events. One biological problem therefore gives rise to several algorithmic problems, depending on the chosen model. Genome rearrangement problems are often very challenging. For many problems, such as sorting by transpositions, it is not known whether they can be solved exactly in time polynomial in the size of the genomes; others, such as the median problem or the multiple genome rearrangement problem, are known to be NP-hard even for the simplest models. For some problems, the complexity depends on the chosen model: many of them can be solved in polynomial time if every gene occurs exactly once in each genome, but become NP-hard as soon as duplications are allowed. From a computer scientist's point of view, this raises both theoretical and algorithmic tasks. For each problem, it is desirable either to find an exact algorithm or to prove that the problem is NP-hard.
In some cases, examining a simplified version of the problem and its connection to the actual problem can be helpful (this approach helped Hannenhalli and Pevzner to obtain their famous result on sorting by reversals). Even if a problem has been shown to be NP-hard, many of its instances can still be solved exactly and efficiently by clever algorithms. Where even this strategy fails, we seek efficient heuristic algorithms. The major results contained in this thesis are as follows: • We provide a linear-time transformation from an arbitrary permutation into its equivalent simple permutation, and an O(n log …
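To make the model behind the abstract concrete, here is a minimal Python sketch. It is not taken from the thesis. It assumes each genome is a signed permutation of 1..n in which every gene occurs exactly once, and that the only allowed operation is a signed reversal. The function names `reversal`, `breakpoints`, and `reversal_distance` are hypothetical. The distance routine is a plain breadth-first search with exponential cost, usable only for very small n; it is not the polynomial-time Hannenhalli-Pevzner method.

```python
from collections import deque


def reversal(perm, i, j):
    """Reverse the segment perm[i..j] (inclusive) and flip the sign of each gene."""
    segment = tuple(-gene for gene in reversed(perm[i:j + 1]))
    return perm[:i] + segment + perm[j + 1:]


def breakpoints(perm):
    """Count breakpoints of a signed permutation relative to the identity.

    The permutation is framed by 0 and n + 1; a pair of neighbours (a, b)
    is an adjacency when b - a == 1 and a breakpoint otherwise.
    """
    n = len(perm)
    framed = (0,) + tuple(perm) + (n + 1,)
    return sum(1 for a, b in zip(framed, framed[1:]) if b - a != 1)


def reversal_distance(perm):
    """Minimum number of signed reversals that sort perm to the identity.

    Illustrative brute force only: explores all reachable permutations
    level by level, so it is feasible only for tiny inputs.
    """
    start = tuple(perm)
    n = len(start)
    target = tuple(range(1, n + 1))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == target:
            return dist[current]
        for i in range(n):
            for j in range(i, n):
                nxt = reversal(current, i, j)
                if nxt not in dist:
                    dist[nxt] = dist[current] + 1
                    queue.append(nxt)
    return None  # unreachable: every signed permutation can be sorted


if __name__ == "__main__":
    genome = (1, -3, -2, 4)
    print(breakpoints(genome))
    print(reversal(genome, 1, 2))
    print(reversal_distance(genome))
```

For the example genome, a single reversal of the middle segment restores the identity. A reversal changes only the two neighbouring pairs at its ends, so it removes at most two breakpoints; this gives the simple lower bound of half the breakpoint count on the reversal distance.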
Similar resources
Algorithms in Bioinformatics: A Practical Introduction Genome Rearrangement Evidences of Genome Rearrangement
Comparative genomics: multiple genome rearrangement and efficient algorithm development
CHAPTER 1. OVERVIEW
1.1 Problem and motivation
1.2 Previous algorithms
1.3 Proposed algorithms
1.4 Further work in the future
1.5 Dissertation organization
References
CHAPTER 2. Multiple Genome Rearrangement by Reversals
2.
Genome-scale evolution: reconstructing gene orders in the ancestral species.
Recent progress in genome-scale sequencing and comparative mapping raises new challenges in studies of genome rearrangements. Although the pairwise genome rearrangement problem is well studied, algorithms for reconstructing rearrangement scenarios for multiple species are greatly needed. Previous approaches to the multiple genome rearrangement problem were largely based on the breakpoint distanc...
Optimization Problems from Genome Sequence Rearrangement
Devin Henson. A problem of great interest is the problem of computing the evolutionary distance between two organisms using their genomic data. We model a genome as a permutation with the genes as elements. Given two permutations, we survey the known results for the shortest sequence of rearrangement operations that transforms one permutat...
Multiple Genome Rearrangement by Reversals
In this paper, we discuss a multiple genome rearrangement problem: given a collection of genomes represented by permutations, we generate the collection from some fixed genome, e.g., the identity permutation, in a minimum number of signed reversals. The problem is NP-hard, so efficient heuristics are important for finding its optimal solution. We first discuss how to generate two and three genomes fro...
Genome Halving and Double Distance with Losses
Given a phylogenetic tree involving whole genome duplication events, we contribute to solving the problem of computing the rearrangement and double cut-and-join (DCJ) distances on a branch of the tree linking a duplication node d to a speciation node or a leaf s. In the case of a genome G at s containing exactly two copies of each gene, the genome halving problem is to find a perfectly duplicat...
Publication date: 2011